ABSTRACT

This short precis outlines a collection of different 
strategies for visualising simple audio features for a GUI-based
audio mixing interface that uses the stage metaphor
control scheme. Audio features such as activity, loudness 
and spectral centroid are extracted in real-time and 
mapped to different visual cues that can be adapted to the 
circular widgets most often found in implementations of 
the stage metaphor. An initial evaluation suggests that 
while the visualisations are generally intuitive and provide 
information about activity of audio channels, they are not 
used directly. When implementing these kinds of dynamic 
graphical visualisations it is thus important to consider 
how intrusive they are compared to their usefulness in a 
real mixing context. 

1. INTRODUCTION 

Recent studies have shown that the stage metaphor (see 
Figure 1), first suggested by Gibson [1], outperforms the 
traditional channel-strip metaphor in several different 
ways [2, 3]. However, because the stage metaphor 
represents channels as circular widgets scattered around 
the 2D GUI space, there are often issues with clutter 
and lack of overview, especially when the number of 
channels is high. Additionally, there are no standard 
visualisations of track activity, monitoring of levels or 
frequency content for single channels using this 
configuration. Drawing upon suggestions from earlier 
studies [4], we here present a collection of different 
graphical visualisation strategies. Common to these 
strategies is that they conform to the graphical style of the 
circular widget, as seen for instance in [5], and that the 
information they provide is to be used at a glance. Several 
related studies deal with visualisation of musical features 
in different contexts [6, 7]. What is important here is that 
the visualisations should downplay the artistic expression 
of the graphics in favour of simplicity and usefulness, 
while still being both intuitive and aesthetically pleasing. 

Figure 1: Left) Stage metaphor, where volume and 
panning are represented by distance and angle 
relative to a listening point. Right) Channel-strip 
metaphor, as seen in traditional mixing consoles. 

2. VISUALISATIONS 

The visualisations presented here attempt to represent 
three overall audio properties for each channel: activity in 
terms of whether the channel is playing or not, 
instantaneous loudness of the channel and a representation 
of spectral brightness (here we compute the spectral 
centroid). In order to explore different strategies, several 
variations of the same prototype were designed and 
implemented: 

• Channel activity - to increase the perception of 
which channels are currently active, channels that 
have levels below a certain threshold are dimmed 
down. See Figure 2. 

• Monitoring of levels - here three prototypes are 
developed that map real-time audio levels of each 
channel to 1) size, 2) length of an angular line 
around each circle and finally 3) brightness of the 
circle. See Figure 3. 

• Monitoring of frequency - here three prototypes 
are developed that visualise the spectral centroid 
of each audio stream. One prototype maps the 
centroid to brightness. The other two prototypes 
implement a line around the circle that is induced 
with noise. The amount of noise increases with 
the spectral centroid. One of these prototypes 
draws a thin line around the circle - the other 
fills out the space entirely. See Figure 4. 
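
The three audio properties listed above can be sketched in a few
lines. This is a minimal illustration rather than the app's actual
implementation: it assumes mono frames of floating-point samples in
the range -1..1, uses RMS level in dB as a simple stand-in for
loudness, and the function names and the -50 dB activity threshold
are hypothetical choices made for the example.

```python
import cmath
import math


def rms_level_db(frame, floor_db=-100.0):
    """RMS level of one frame in dB relative to full scale (assumed loudness proxy)."""
    if not frame:
        return floor_db
    mean_square = sum(x * x for x in frame) / len(frame)
    if mean_square <= 0.0:
        return floor_db
    return max(floor_db, 10.0 * math.log10(mean_square))


def is_active(frame, threshold_db=-50.0):
    """A channel counts as active when its level reaches the (hypothetical) threshold."""
    return rms_level_db(frame) >= threshold_db


def spectral_centroid_hz(frame, sample_rate):
    """Magnitude-weighted mean frequency of the frame in Hz (0.0 for silence)."""
    n = len(frame)
    if n == 0:
        return 0.0
    weighted_sum = 0.0
    magnitude_sum = 0.0
    # Naive DFT over the non-negative frequency bins, kept dependency-free;
    # a real-time system would use an FFT instead.
    for k in range(n // 2 + 1):
        bin_value = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)
        )
        magnitude = abs(bin_value)
        weighted_sum += (k * sample_rate / n) * magnitude
        magnitude_sum += magnitude
    return weighted_sum / magnitude_sum if magnitude_sum > 1e-12 else 0.0
```

Each function takes one short frame of samples, so all three
features can be recomputed per frame and per channel while audio is
playing.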

A major concern when dealing with visualisation in a 
mixing situation is the risk of overburdening the user 
with visual information, thus drawing attention away from 
listening. Therefore, a prototype with no level, activity or 
frequency metering was also made for the user evaluation. 
The different features were built as an add-on to an 
existing iPad app presented in an earlier study [4]. 
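
The mappings from features to visual cues described in this section
could look roughly as follows. This is a sketch under stated
assumptions: the value ranges (-60 to 0 dB, 100 Hz to 10 kHz), the
logarithmic frequency scale, the pixel sizes, the dimmed opacity and
all function names are hypothetical and are not taken from the app.

```python
import math


def normalise(value, low, high):
    """Clamp value to [low, high] and rescale it to the range 0..1."""
    if high <= low:
        raise ValueError("high must be greater than low")
    return min(1.0, max(0.0, (value - low) / (high - low)))


def level_to_visuals(level_db, min_db=-60.0, max_db=0.0,
                     base_radius=30.0, max_extra_radius=15.0):
    """Map a level in dB to the three level cues: size, ring length, brightness."""
    amount = normalise(level_db, min_db, max_db)
    return {
        "radius_px": base_radius + amount * max_extra_radius,
        "ring_angle_deg": amount * 360.0,
        "brightness": amount,
    }


def centroid_to_visuals(centroid_hz, min_hz=100.0, max_hz=10000.0,
                        max_noise_px=6.0):
    """Map a spectral centroid to brightness and to the noise amplitude of the ring."""
    # Assumed design choice: a log frequency scale, so octaves are evenly spaced.
    log_centroid = math.log10(max(centroid_hz, 1e-9))
    amount = normalise(log_centroid, math.log10(min_hz), math.log10(max_hz))
    return {"brightness": amount, "noise_amplitude_px": amount * max_noise_px}


def activity_opacity(active, dimmed_opacity=0.3):
    """Inactive channels are dimmed down; active ones are drawn at full opacity."""
    return 1.0 if active else dimmed_opacity
```

Each prototype would use only one of the returned cues at a time,
for example only `ring_angle_deg` for the angular ring variation.
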

Figure 2: Channels that are inactive (below an 
amplitude threshold) are dimmed down. Here 
vocals and keys are inactive. 



Figure 3: Mappings of instantaneous loudness: size, 
angular ring length and brightness. 

Figure 4: Mappings of spectral centroid: brightness, 
rings induced with noise, same rings but filled out. 


3. INFORMAL EVALUATION 

An informal exploratory evaluation was carried out in which 
six professional audio engineers (average age: 33, average 
mixing experience: 12 years) were asked to first explore 
the mixing interface without any visualisation, while 
commenting on their experience. This session had three 
main purposes: 1) to let the participants get a feel for how 
the mixing interface worked, 2) to elicit any shortcomings 
of the interface that might be related to the lack of 
monitoring and 3) to serve as a reference in order to assess 
the importance of the different visualisation features to be 
tested. After having gained initial experience with the 
interface, test participants proceeded to explore the 
different variations presented earlier in a randomised 
order. 
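
The session structure described above, a baseline without
visualisation followed by the variations in randomised order, can be
sketched as below. The labels, the seeding scheme and the assumption
that each of the seven variations is presented on its own are
illustrative and not confirmed details of the study.

```python
import random

# Hypothetical labels for the variations described in Section 2.
VARIATIONS = [
    "activity: dimming",
    "level: size",
    "level: angular ring length",
    "level: brightness",
    "centroid: brightness",
    "centroid: thin noisy ring",
    "centroid: filled noisy ring",
]


def session_plan(participant_id, seed=0):
    """Return the baseline followed by all variations in a per-participant random order."""
    # A dedicated generator keeps each participant's order reproducible.
    rng = random.Random(seed * 1000 + participant_id)
    order = VARIATIONS[:]
    rng.shuffle(order)
    return ["baseline: no visualisation"] + order
```

Calling `session_plan(1)` through `session_plan(6)` would give one
plan per participant, each starting with the baseline.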

For each variation they were asked to provide feedback as 
to how they experienced the different prototypes. At first 
they were not instructed to focus on the visualisations 
themselves. This was important in order to judge whether 
they naturally emphasised these visualisations in the 
feedback they provided or whether they put more 
emphasis on other features of the interface. If they did 
not mention the visualisations at all, they were 
explicitly asked whether they had noticed them and what 
their thoughts were about them. At the same time, the fact 
that a participant did not notice the visualisations would 
indicate to us that these particular visualisations were not 
important in that context. 


3.1. Results 

In general, results indicate that the visualisations proposed 
were not used directly (i.e. for identifying absolute 
differences between channel loudness or frequency 
content in the mix), but were regarded as pleasant and 
intuitive, and for the most part as supporting the audio. 
First indications suggest that brightness has to be used 
with care, as rapid changes in brightness in particular draw 
attention, which is not desirable when, for instance, 
working on a different channel. While brightness 
can be used to represent loudness it is not a good 
representation of spectral brightness in this context, where 
the instantaneous spectral centroid shifts rapidly back and 
forth. This seemed unintuitive to most of the participants. 
Likewise, size changes (while being intuitive in terms of 
how they map to loudness) can also draw too much 
attention away from what the user is working on at a 
particular moment in time. Furthermore, channel widgets 
of different sizes contribute to a cluttered interface. 
Channel inactivity, which was represented by dimming 
down the channel widget, was widely used as an indication 
of which channels were currently playing and felt natural 
to the users. Spectral centroid mapped to a line induced 
with noise was intuitive though perhaps too subtle to be of 
direct use. Finally, loudness information mapped to both 
the size of the outer ring and the angular ring length was 
appreciated for providing a sense of dynamics - again not 
as a direct tool. Whether that sense is important or takes 
away focus from listening would be interesting to test. 